A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR
Internal identifier: 000759 (Main/Exploration); previous: 000758; next: 000760
Authors: Ignacio Abreu Salas [Cuba]; Ram N Rico-Juan [Spain]
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2010.
Abstract
This paper presents a new, fast algorithm for computing an approximation to the median of two character strings that represent 2D shapes, and applies it in a new dataset editing scheme intended to lower the classification error rate. The median string is obtained by applying certain edit operations, taken from the minimum-cost edit sequence, to one of the original strings. The new dataset editing scheme relaxes the instance-deletion criterion of the Wilson Editing Procedure: in practice, not every instance misclassified by its nearest neighbors is pruned. Instead, an artificial instance is added to the dataset in the expectation that the instance will be classified correctly in the future. The artificial instance is the median of the misclassified sample and its same-class nearest neighbor. Experiments on two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials.
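The abstract gives only the outline of the method, so the following Python sketch is an illustration under stated assumptions and not the authors' algorithm. It assumes unit-cost Levenshtein operations, assumes that the "certain edit operations" are simply the first half of the minimum-cost edit script, and assumes a 1-nearest-neighbor, leave-one-out check that deletes nothing and only adds artificial instances. The function names (`edit_script`, `approx_median`, `add_medians`) are hypothetical.

```python
from typing import List, Tuple

Op = Tuple[str, int, str]  # (kind, position in the source string, symbol)


def edit_script(a: str, b: str) -> List[Op]:
    """Return one minimum-cost edit script turning a into b (unit costs)."""
    n, m = len(a), len(b)
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i
    for j in range(1, m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1, d[i - 1][j - 1] + cost)
    # Backtrace from the bottom-right corner, collecting non-match operations.
    ops: List[Op] = []
    i, j = n, m
    while i > 0 or j > 0:
        cost = 1 if i > 0 and j > 0 and a[i - 1] != b[j - 1] else 0
        if i > 0 and j > 0 and d[i][j] == d[i - 1][j - 1] + cost:
            if cost:
                ops.append(("sub", i - 1, b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("del", i - 1, ""))
            i -= 1
        else:
            ops.append(("ins", i, b[j - 1]))  # insert before a[i]
            j -= 1
    ops.reverse()  # left-to-right order
    return ops


def approx_median(a: str, b: str) -> str:
    """Apply the first half of the edit script to a (an assumed selection rule)."""
    ops = edit_script(a, b)
    chosen = ops[: len(ops) // 2]
    s = list(a)
    # Apply right to left so earlier positions in a stay valid.
    for kind, pos, sym in reversed(chosen):
        if kind == "sub":
            s[pos] = sym
        elif kind == "del":
            del s[pos]
        else:
            s.insert(pos, sym)
    return "".join(s)


def distance(a: str, b: str) -> int:
    return len(edit_script(a, b))


def add_medians(data: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """For each sample misclassified by its nearest neighbor, add the median
    of the sample and its same-class nearest neighbor (nothing is deleted)."""
    added = []
    for idx, (x, label) in enumerate(data):
        others = [(distance(x, s), s, l) for k, (s, l) in enumerate(data) if k != idx]
        if not others:
            continue
        if min(others, key=lambda t: t[0])[2] != label:
            same = [t for t in others if t[2] == label]
            if same:
                neighbor = min(same, key=lambda t: t[0])[1]
                added.append((approx_median(x, neighbor), label))
    return data + added
```

Because the selected operations are a subset of one optimal edit script, the resulting string lies on a shortest edit path between the two inputs: its distances to the two originals sum to the distance between them. Weighted edit costs or a different selection of operations would change the result.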
DOI: 10.1007/978-3-642-14980-1_74
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 002048
- to stream Istex, to step Curation: 001F10
- to stream Istex, to step Checkpoint: 000339
- to stream Main, to step Merge: 000764
- to stream Main, to step Curation: 000759
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR</title>
<author><name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</author>
<author><name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4</idno>
<date when="2010" year="2010">2010</date>
<idno type="doi">10.1007/978-3-642-14980-1_74</idno>
<idno type="url">https://api.istex.fr/document/492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">002048</idno>
<idno type="wicri:Area/Istex/Curation">001F10</idno>
<idno type="wicri:Area/Istex/Checkpoint">000339</idno>
<idno type="wicri:doubleKey">0302-9743:2010:Abreu Salas I:a:new:editing</idno>
<idno type="wicri:Area/Main/Merge">000764</idno>
<idno type="wicri:Area/Main/Curation">000759</idno>
<idno type="wicri:Area/Main/Exploration">000759</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR</title>
<author><name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
<affiliation wicri:level="1"><country xml:lang="fr">Cuba</country>
<wicri:regionArea>Universidad de Matanzas</wicri:regionArea>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Cuba</country>
</affiliation>
</author>
<author><name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
<affiliation wicri:level="4"><country xml:lang="fr">Espagne</country>
<wicri:regionArea>Dpto Lenguajes y Sistemas Informáticos, Universidad de Alicante</wicri:regionArea>
<placeName><settlement type="city">Alicante</settlement>
<region nuts="2" type="region">Communauté valencienne</region>
</placeName>
<orgName type="university">Université d'Alicante</orgName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">Espagne</country>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2010</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4</idno>
<idno type="DOI">10.1007/978-3-642-14980-1_74</idno>
<idno type="ChapterID">74</idno>
<idno type="ChapterID">Chap74</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: This paper presents a new, fast algorithm for computing an approximation to the median of two character strings that represent 2D shapes, and applies it in a new dataset editing scheme intended to lower the classification error rate. The median string is obtained by applying certain edit operations, taken from the minimum-cost edit sequence, to one of the original strings. The new dataset editing scheme relaxes the instance-deletion criterion of the Wilson Editing Procedure: in practice, not every instance misclassified by its nearest neighbors is pruned. Instead, an artificial instance is added to the dataset in the expectation that the instance will be classified correctly in the future. The artificial instance is the median of the misclassified sample and its same-class nearest neighbor. Experiments on two widely used datasets of handwritten characters show that this preprocessing scheme can reduce the classification error in about 78% of trials.</div>
</front>
</TEI>
<affiliations><list><country><li>Cuba</li>
<li>Espagne</li>
</country>
<region><li>Communauté valencienne</li>
</region>
<settlement><li>Alicante</li>
</settlement>
<orgName><li>Université d'Alicante</li>
</orgName>
</list>
<tree><country name="Cuba"><noRegion><name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</noRegion>
<name sortKey="Abreu Salas, Ignacio" sort="Abreu Salas, Ignacio" uniqKey="Abreu Salas I" first="Ignacio" last="Abreu Salas">Ignacio Abreu Salas</name>
</country>
<country name="Espagne"><region name="Communauté valencienne"><name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</region>
<name sortKey="Rico Juan, Ram N" sort="Rico Juan, Ram N" uniqKey="Rico Juan R" first="Ram N" last="Rico-Juan">Ram N Rico-Juan</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000759 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000759 | SxmlIndent | more
To add a link to this page in the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:492AFA211AC0CD68A47DF8F47C37AC9C5B0ED5F4 |texte= A New Editing Scheme Based on a Fast Two-String Median Computation Applied to OCR }}
This area was generated with Dilib version V0.6.32.